{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "21cfea06-af98-496a-b13b-106c335a2e65",
   "metadata": {},
   "source": [
    "# Querying by object\n",
    "\n",
    "You can run a query that uses an object already in storage as the search input. The query returns the stored items most relevant to that object."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "1bf67328-efe5-4c88-9c36-ce2d4b20d89f",
   "metadata": {},
   "outputs": [],
   "source": [
    "%pip install superlinked==3.38.0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "5a25f839-c109-4447-adbe-0de4ff6f39cf",
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "\n",
    "from superlinked.framework.common.schema.id_schema_object import IdField\n",
    "from superlinked.framework.common.schema.schema import schema\n",
    "from superlinked.framework.common.schema.schema_object import String\n",
    "from superlinked.framework.dsl.index.index import Index\n",
    "from superlinked.framework.dsl.space.text_similarity_space import TextSimilaritySpace\n",
    "\n",
    "from superlinked.framework.dsl.executor.in_memory.in_memory_executor import (\n",
    "    InMemoryExecutor,\n",
    ")\n",
    "from superlinked.framework.dsl.source.in_memory_source import InMemorySource\n",
    "from superlinked.framework.dsl.query.query import Query\n",
    "\n",
    "pd.set_option(\"display.max_colwidth\", 100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "08f71264-4cf4-405b-9f55-7ea2863c7924",
   "metadata": {},
   "outputs": [],
   "source": [
    "@schema\n",
    "class Paragraph:\n",
    "    id: IdField\n",
    "    body: String\n",
    "\n",
    "\n",
    "paragraph = Paragraph()\n",
    "\n",
    "body_space = TextSimilaritySpace(\n",
    "    text=paragraph.body, model=\"sentence-transformers/all-mpnet-base-v2\"\n",
    ")\n",
    "paragraph_index = Index(body_space)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "30e0ec2b-540b-4f77-8c7f-96c768f2029a",
   "metadata": {},
   "source": [
    "Now let's start an executor and add some data to our index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "979eaa8a-3cec-4340-bbf2-776f3c79dd67",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Create an in-memory source for Paragraph objects and run an executor over the index.\n",
    "source: InMemorySource = InMemorySource(paragraph)\n",
    "executor = InMemoryExecutor(sources=[source], indices=[paragraph_index])\n",
    "app = executor.run()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "0d0e746f-fd42-4364-8453-320cf7c20035",
   "metadata": {},
   "outputs": [],
   "source": [
    "source.put(\n",
    "    [\n",
    "        {\"id\": \"paragraph-1\", \"body\": \"Glorious animals live in the wilderness.\"},\n",
    "        {\n",
    "            \"id\": \"paragraph-2\",\n",
    "            \"body\": \"Growing computation power enables advancements in AI.\",\n",
    "        },\n",
    "        {\n",
    "            \"id\": \"paragraph-3\",\n",
    "            \"body\": \"The flora and fauna of a specific habitat highly depend on the weather.\",\n",
    "        },\n",
    "    ]\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c24cdb32-fa33-4fbd-82a6-48527cc8e8d5",
   "metadata": {},
   "source": [
    "## Using the .with_vector clause\n",
    "\n",
    "The `.with_vector` clause lets us search with the vector of an object that is already in our database. This is useful, for example, for recommending items to a user based on that user's vector."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "a2ed953c-4376-4cdc-a2c8-eba93fdeb622",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Search the index with the stored vector of the object whose id is \"paragraph-3\".\n",
    "query = Query(paragraph_index).find(paragraph).with_vector(paragraph, \"paragraph-3\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "36ff4b5c-82ae-4555-a2f6-9bad9b4e73f3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>body</th>\n",
       "      <th>id</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>The flora and fauna of a specific habitat highly depend on the weather.</td>\n",
       "      <td>paragraph-3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Glorious animals live in the wilderness.</td>\n",
       "      <td>paragraph-1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Growing computation power enables advancements in AI.</td>\n",
       "      <td>paragraph-2</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "                                                                      body  \\\n",
       "0  The flora and fauna of a specific habitat highly depend on the weather.   \n",
       "1                                 Glorious animals live in the wilderness.   \n",
       "2                    Growing computation power enables advancements in AI.   \n",
       "\n",
       "            id  \n",
       "0  paragraph-3  \n",
       "1  paragraph-1  \n",
       "2  paragraph-2  "
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "result = app.query(query)\n",
    "\n",
    "result.to_pandas()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d773d9ea-fd4d-4657-8968-5bd2e1c87a07",
   "metadata": {},
   "source": [
    "The first result is the paragraph we searched with (`paragraph-3`). The second is the more closely related paragraph (`paragraph-1`, which is also about nature), and the least related paragraph (`paragraph-2`, about AI) comes last."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "superlinked-py3.10",
   "language": "python",
   "name": "superlinked-py3.10"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
